Deep Reinforcement Learning for Model Predictive Controller Based on Disturbed Single Rigid Body Model of Biped Robots
Authors
Abstract
This paper modifies the single rigid body (SRB) model and treats the swinging leg as a disturbance to the centroid acceleration and rotational acceleration of the SRB model. It proposes a deep reinforcement learning (DRL)-based model predictive control (MPC) method to resist the disturbances of the swinging leg. The DRL predicts the swing-leg disturbances, and the MPC then gives the optimal ground reaction forces according to the predicted disturbances. We use the proximal policy optimization (PPO) algorithm among the DRL methods since it is a very stable and widely applicable algorithm. It is an on-policy algorithm based on the actor–critic framework. The simulation results show that the improved PPO-based method can accurately predict the disturbance of the swinging leg, making locomotion more robust.
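The modified model in the abstract can be read as ordinary SRB dynamics with two extra terms, one on the centroid acceleration and one on the rotational acceleration. The sketch below illustrates that reading only and does not reproduce the paper's formulation. It assumes the predicted disturbances enter as additive accelerations, a diagonal inertia, and a neglected gyroscopic term. All names (`disturbed_srb_accelerations`, `d_lin`, `d_ang`) and numeric values are hypothetical.

```python
# Hypothetical sketch: disturbed single-rigid-body (SRB) dynamics.
# Names, structure, and numbers are illustrative assumptions, not the paper's code.

GRAVITY = (0.0, 0.0, -9.81)  # m/s^2, world frame


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def disturbed_srb_accelerations(mass, inertia_diag, contacts, d_lin, d_ang):
    """Return (linear_acc, angular_acc) of the SRB.

    contacts: list of (r, f) pairs, where r is the foot position relative
              to the centroid and f is the ground reaction force.
    d_lin, d_ang: predicted swing-leg disturbances, assumed here to be
              additive linear and angular accelerations.
    Assumes a diagonal inertia and neglects the gyroscopic term.
    """
    total_force = [0.0, 0.0, 0.0]
    total_torque = [0.0, 0.0, 0.0]
    for r, f in contacts:
        tau = cross(r, f)
        for k in range(3):
            total_force[k] += f[k]
            total_torque[k] += tau[k]
    linear_acc = tuple(
        total_force[k] / mass + GRAVITY[k] + d_lin[k] for k in range(3)
    )
    angular_acc = tuple(
        total_torque[k] / inertia_diag[k] + d_ang[k] for k in range(3)
    )
    return linear_acc, angular_acc


# Usage: one stance foot directly below the centroid, supporting the full weight.
lin, ang = disturbed_srb_accelerations(
    mass=10.0,
    inertia_diag=(0.5, 0.5, 0.2),
    contacts=[((0.0, 0.0, -0.5), (0.0, 0.0, 98.1))],
    d_lin=(0.1, 0.0, 0.0),
    d_ang=(0.0, 0.2, 0.0),
)
```

In this usage the ground reaction force cancels gravity and produces no torque, so the remaining accelerations are exactly the disturbance terms. Under this reading, those residuals are what a learned predictor would supply and what the MPC would compensate for when choosing ground reaction forces.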
Similar resources
Mortality forecasting based on Lee-Carter model
Over the past decades, a number of approaches have been applied to forecasting mortality. In 1992, a new method for long-run forecasting of the level and age pattern of mortality was published by Lee and Carter. This method was welcomed by many authors, so it was extended through a wider class of generalized, parametric and nonlinear models. This model represents one of the most influential recent d...
Robust Trajectory Free Model Predictive Control of Biped Robots with Adaptive Gait Length
This paper employs a nonlinear disturbance observer (NDO) for robust trajectory-free Nonlinear Model Predictive Control (NMPC) of biped robots. The NDO is used to reject the additive disturbances caused by parameter uncertainties, unmodeled dynamics, joint friction, and external slow-varying forces acting on the biped robots. In contrast to the slow-varying disturbances, handling sudden pushing ...
Operation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm
In this paper, the operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs), is proposed using a Deep Reinforcement Learning (DRL)-based approach. Due to the dynamic characteristics of the problem, it is first formulated as a Markov Decision Process (MDP). Next, the Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...
Deep Episodic Value Iteration for Model-based Meta-Reinforcement Learning
We present a new deep meta reinforcement learner, which we call Deep Episodic Value Iteration (DEVI). DEVI uses a deep neural network to learn a similarity metric for a non-parametric model-based reinforcement learning algorithm. Our model is trained end-to-end via back-propagation. Despite being trained using the model-free Q-learning objective, we show that DEVI’s model-based internal structu...
Using BELBIC based optimal controller for omni-directional threewheel robots model identified by LOLIMOT
In this paper, an intelligent controller is applied to control the motion of omni-directional robots. First, the dynamics of the three-wheel robot, a nonlinear plant with considerable uncertainties, are identified using an efficient training algorithm named LoLiMoT. Then, an intelligent controller based on the brain emotional learning algorithm is applied to the identified model. This emotional l...
Journal
Journal title: Machines
Year: 2022
ISSN: 2075-1702
DOI: https://doi.org/10.3390/machines10110975